Emergence of Language with Multi-Agent Games: Learning to Communicate with Sequence of Symbols

Authors: Havrylov Serhii, Titov Ivan
Publication date: 26/04/2017

Optimizing Differentiable Relaxations of Coreference Evaluation Metrics

Authors: Le Phong, Titov Ivan
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

Coreference evaluation metrics are hard to optimize directly as they are non-differentiable functions, not easily decomposable into elementary decisions. Consequently, most approaches optimize objectives only indirectly related to the end goal, resulting in suboptimal performance. Instead, we propose a differentiable relaxation that lends itself to gradient-based optimisation, thus bypassing the need for reinforcement learning or heuristic modification of cross-entropy. We show that by modifying the training objective of a competitive neural coreference system, we obtain a substantial gain in performance. This suggests that our approach can be regarded as a viable alternative to using reinforcement learning or more computationally expensive imitation learning.

Comment: 10 pages. CoNL

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling

Authors: Marcheggiani Diego, Titov Ivan
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 30/07/2017

Semantic role labeling (SRL) is the task of identifying the predicate-argument structure of a sentence. It is typically regarded as an important step in the standard NLP pipeline. As the semantic representations are closely related to syntactic ones, we exploit syntactic information in our model. We propose a version of graph convolutional networks (GCNs), a recent class of neural networks operating on graphs, suited to model syntactic dependency graphs. GCNs over syntactic dependency trees are used as sentence encoders, producing latent feature representations of words in a sentence. We observe that GCN layers are complementary to LSTM ones: when we stack both GCN and LSTM layers, we obtain a substantial improvement over an already state-of-the-art LSTM SRL model, resulting in the best reported scores on the standard benchmark (CoNLL-2009) both for Chinese and English.

Comment: To appear in EMNLP 201

arXiv.org e-Print Archive

Edinburgh Research Explorer
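
As a rough illustration of the idea in the Marcheggiani and Titov abstract above (a graph convolution over a syntactic dependency tree producing one feature vector per word), here is a minimal sketch in plain Python. It is not the model from the paper: the function name `gcn_layer`, the `heads` encoding of the tree, the mean aggregation, and the single shared weight matrix are all assumptions made for illustration, and the sketch ignores edge direction and dependency labels.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def gcn_layer(features: Matrix, heads: List[int],
              weight: Matrix, bias: Vector) -> Matrix:
    """One simplified graph-convolution step over a dependency tree.

    features[i] is the input vector of word i.
    heads[i] is the index of word i's syntactic head, or -1 for the root
    (a hypothetical encoding chosen for this sketch).
    weight has shape (in_dim, out_dim); bias has length out_dim.
    """
    n = len(features)
    in_dim = len(features[0])
    out_dim = len(bias)

    # Each word's neighbourhood: itself (self-loop), its head, its children.
    neighbours = [{i} for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            neighbours[child].add(head)
            neighbours[head].add(child)

    output = []
    for i in range(n):
        # Average the neighbours' feature vectors.
        size = len(neighbours[i])
        agg = [sum(features[j][d] for j in neighbours[i]) / size
               for d in range(in_dim)]
        # Linear map followed by ReLU.
        row = [max(0.0, sum(agg[d] * weight[d][k] for d in range(in_dim)) + bias[k])
               for k in range(out_dim)]
        output.append(row)
    return output


# Toy three-word sentence: word 1 is the root, words 0 and 2 depend on it.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_heads = [1, -1, 1]
identity = [[1.0, 0.0], [0.0, 1.0]]
encoded = gcn_layer(toy_features, toy_heads, identity, [0.0, 0.0])
```

With the identity weight matrix and zero bias, each output row is simply the average of a word's own vector and those of its head and children, so stacking several such layers lets information travel along longer paths in the tree.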